Multi-band summary correlogram-based pitch detection for noisy speech

Authors

  • Lee Ngee Tan
  • Abeer Alwan
Abstract

A multi-band summary correlogram (MBSC)-based pitch detection algorithm (PDA) is proposed. The PDA performs pitch estimation and voiced/unvoiced (V/UV) detection via novel signal processing schemes designed to enhance the MBSC’s peaks at the most likely pitch period. These peak-enhancement schemes include comb-filter channel weighting, which yields a summary correlogram (SC) stream for each subband, and stream-reliability weighting, which combines these SCs into a single MBSC. V/UV detection is performed by applying a constant threshold to the maximum peak of the enhanced MBSC. Narrowband noisy speech sampled at 8 kHz is generated from the Keele (development set) and CSTR (Centre for Speech Technology Research; evaluation set) corpora. Both 4-kHz fullband speech and G.712-filtered telephone speech are simulated. When evaluated solely on pitch estimation accuracy, assuming perfect voicing detection, the proposed algorithm has the lowest gross pitch error for noisy speech in the evaluation set among the algorithms evaluated (RAPT, YIN, etc.). The proposed PDA also achieves the lowest average pitch detection error when both pitch estimation and voicing detection errors are taken into account. © 2013 Elsevier B.V. All rights reserved.
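The thresholding idea in the abstract can be illustrated with a much simpler stand-in. The sketch below is not the MBSC algorithm: it uses a single fullband autocorrelation in place of the multi-band summary correlogram and omits the subband decomposition, comb-filter channel weighting, and stream-reliability weighting. It only shows how a constant threshold on the maximum correlation peak can give a V/UV decision and how the peak lag maps to a pitch value. The function names, the 50-400 Hz search range, the 0.5 threshold, and the 20% tolerance in the gross pitch error helper are all illustrative assumptions, not values taken from the paper.

```python
import math


def autocorrelation_peak(frame, lag):
    """Biased autocorrelation at `lag`, normalized by the frame energy.

    The biased estimate shrinks as the lag grows, which favors the
    fundamental period over its integer multiples.
    """
    energy = sum(x * x for x in frame)
    if energy == 0.0:
        return 0.0
    n = len(frame) - lag
    return sum(frame[i] * frame[i + lag] for i in range(n)) / energy


def detect_pitch(frame, fs=8000, f0_min=50.0, f0_max=400.0, threshold=0.5):
    """Return (voiced, f0_hz) for one frame.

    Assumed, illustrative parameters: the search range and the constant
    V/UV threshold are not taken from the paper.
    """
    lag_min = int(fs / f0_max)
    lag_max = min(int(fs / f0_min), len(frame) - 1)
    best_lag, best_peak = 0, 0.0
    for lag in range(lag_min, lag_max + 1):
        peak = autocorrelation_peak(frame, lag)
        if peak > best_peak:
            best_lag, best_peak = lag, peak
    # Constant threshold on the maximum peak decides voiced/unvoiced.
    voiced = best_peak >= threshold
    return voiced, (fs / best_lag if voiced else 0.0)


def gross_pitch_error(estimates, references, tolerance=0.2):
    """Fraction of voiced frames whose estimate is off by more than
    `tolerance` (relative to the reference). The 20% figure is a common
    convention, assumed here.
    """
    errors = sum(
        1 for est, ref in zip(estimates, references)
        if abs(est - ref) > tolerance * ref
    )
    return errors / len(references)


if __name__ == "__main__":
    fs = 8000
    # 40 ms frame of a 200 Hz sinusoid (period = 40 samples at 8 kHz).
    tone = [math.sin(2.0 * math.pi * 200.0 * i / fs) for i in range(320)]
    print(detect_pitch(tone, fs))          # voiced, 200.0 Hz
    print(detect_pitch([0.0] * 320, fs))   # silence is labeled unvoiced
```

In this toy setup, raising `threshold` trades missed voiced frames for fewer false voiced decisions; the paper's peak-enhancement schemes aim to make such a single constant threshold workable in noise.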

Similar resources

Robust Speech and Bird Song Processing using Multi-band Correlograms and Sparse Representations

Dissertation by Lee Ngee Tan, Doctor of Philosophy in Electrical Engineering, University of California, Los Angeles, 2014. Chair: Professor Abeer Alwan. This dissertation focuses on algorithms for robust speech and bird song processing. Many applications perform well under ideal signal conditions,...

Monaural Voiced Speech Segregation Based on Pitch and Comb Filter

The correlogram is an important mid-level representation for periodic sounds and is widely used in sound source separation and pitch detection. However, it is very time-consuming to compute. In this paper, we present a novel scheme for monaural voiced speech separation without computing correlograms. The noisy speech is first decomposed into time-frequency units. Pitch contour of the target speech ...

Pitch estimation of noisy speech signals using empirical mode decomposition

This paper presents a pitch estimation method for noisy speech signals using empirical mode decomposition (EMD). The normalized autocorrelation function (NACF) of the noisy speech signal is decomposed by EMD into a finite set of band-limited signals termed intrinsic mode functions (IMFs). The periodicity of one IMF is assumed to equal the true pitch period. A conventional autocor...

Pitch Tracking Based on Statistical Anticipation

An effective multi-pitch tracking algorithm for noisy speech is critical for auditory processing. However, the performance of existing algorithms is not satisfactory. We have developed a robust algorithm for multi-pitch tracking of noisy speech based on statistical anticipation. By combining an improved channel and peak selection method, a new integration method for extracting periodicity infor...

An Automatic Pitch Detection Method Based on Multi-feature for Mandarin Speech

There are many traditional pitch detection methods, but most of them cannot perform perfectly across different speakers, applications, and environmental conditions. For this reason, a pitch detection method based on multiple features is proposed. First, the speech signals are pre-filtered. Second, the pre-filtered speech signal is segmented into syllables. Finally, the pitch period is obtained by wa...

Journal:
  • Speech Communication

Volume 55

Publication date: 2013